Chloroplast DNA (cpDNA), also known as plastid DNA (ptDNA), is the DNA located in chloroplasts, which are photosynthetic organelles located within the cells of some eukaryotic organisms. Chloroplasts, like other types of plastid, contain a genome separate from that in the cell nucleus. The existence of chloroplast DNA was identified biochemically in 1959 and confirmed by electron microscopy in 1962. The discoveries that the chloroplast contains ribosomes and performs protein synthesis revealed that the chloroplast is genetically semi-autonomous. The first complete chloroplast genome sequences were published in 1986: Nicotiana tabacum (tobacco) by Sugiura and colleagues and Marchantia polymorpha (liverwort) by Ozeki et al. Since then, tens of thousands of chloroplast genomes from various species have been sequenced.
Most chloroplasts have their entire chloroplast genome combined into a single large ring, though those of dinophyte algae are a notable exception—their genome is broken up into about forty small plasmids, each 2,000–10,000 base pairs long. Each minicircle contains one to three genes, but blank plasmids, with no coding DNA, have also been found.
Chloroplast DNA has long been thought to have a circular structure, but some evidence suggests that chloroplast DNA more commonly takes a linear shape. Over 95% of the chloroplast DNA in corn chloroplasts has been observed to be in branched linear form rather than individual circles.
The inverted repeats vary widely in length, ranging from 4,000 to 25,000 base pairs each. Inverted repeats in plants tend to be at the upper end of this range, each being 20,000–25,000 base pairs long. The inverted repeat regions usually contain three ribosomal RNA and two tRNA genes, but they can be expanded or reduced to contain as few as four or as many as over 150 genes. While the two copies in a given pair of inverted repeats are rarely completely identical, they are always very similar to each other, apparently as a result of concerted evolution.
The inverted repeat regions are highly conserved among land plants, and accumulate few mutations. Similar inverted repeats exist in the genomes of cyanobacteria and the other two chloroplast lineages (glaucophyta and red algae), suggesting that they predate the chloroplast, though some chloroplast DNAs, like those of peas and a few red algae, have since lost the inverted repeats. Others, like that of the red alga Porphyra, have flipped one of their inverted repeats (making them direct repeats). It is possible that the inverted repeats help stabilize the rest of the chloroplast genome, as chloroplast DNAs which have lost some of the inverted repeat segments tend to get rearranged more.
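The difference between inverted and direct repeats can be made concrete with a minimal sketch. It assumes both copies are read from the same strand and match exactly, which is a simplification, since real repeat copies are rarely completely identical. The function names are illustrative only.

```python
# Complement table for the four DNA bases (uppercase only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify_repeat(copy_a: str, copy_b: str) -> str:
    """Classify two segments read from the same strand.

    'direct'   : the second copy has the same sequence as the first.
    'inverted' : the second copy is the reverse complement of the first.
    'none'     : neither relationship holds (exact matching only).
    """
    if copy_a == copy_b:
        return "direct"
    if copy_a == reverse_complement(copy_b):
        return "inverted"
    return "none"


if __name__ == "__main__":
    print(classify_repeat("ATGCC", "GGCAT"))  # second copy is the reverse complement
    print(classify_repeat("ATGCC", "ATGCC"))  # second copy is identical
```

Flipping one copy of an inverted pair, as described for Porphyra, corresponds to replacing that copy with its reverse complement, after which the pair classifies as direct.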
Though chloroplast DNA is not associated with true histones, red algae have been found to contain a histone-like chloroplast protein (HC), coded by the chloroplast DNA, that tightly packs each chloroplast DNA ring into a nucleoid.
In primitive red algae, the chloroplast DNA nucleoids are clustered in the center of a chloroplast, while in green plants and green algae, the nucleoids are dispersed throughout the stroma.
In most plant species, the chloroplast genome encodes approximately 120 genes. The genes primarily encode core components of the photosynthetic machinery and factors involved in their expression and assembly. Across species of land plants, the set of genes encoded by the chloroplast genome is fairly conserved. This includes four ribosomal RNAs, approximately 30 tRNAs, 21 ribosomal proteins, and 4 subunits of the plastid-encoded RNA polymerase complex that are involved in plastid gene expression. The large Rubisco subunit and 28 photosynthetic thylakoid proteins are encoded within the chloroplast genome.
Endosymbiotic gene transfer is how we know about the lost chloroplasts in many chromalveolate lineages. Even if a chloroplast is eventually lost, the genes it donated to the former host's nucleus persist, providing evidence for the lost chloroplast's existence. For example, while diatoms (a heterokontophyte) now have a red algal derived chloroplast, the presence of many green algal genes in the diatom nucleus provides evidence that the diatom ancestor (probably the ancestor of all chromalveolates too) had a green algal derived chloroplast at some point, which was subsequently replaced by the red chloroplast.
In land plants, some 11–14% of the DNA in their nuclei can be traced back to the chloroplast, up to 18% in Arabidopsis, corresponding to about 4,500 protein-coding genes. There have been a few recent transfers of genes from the chloroplast DNA to the nuclear genome in land plants.
The editosome recognizes and binds to a cis sequence upstream of the editing site. The distance between the binding site and the editing site varies by gene and by the proteins involved in the editosome. Hundreds of different PPR proteins from the nuclear genome are involved in the RNA editing process. These proteins consist of 35-mer repeated amino acids, the sequence of which determines the cis binding site for the edited transcript.
Basal land plants such as liverworts, mosses and ferns have hundreds of different editing sites while flowering plants typically have between thirty and forty. Parasitic plants such as Epifagus virginiana show a loss of RNA editing resulting in a loss of function for photosynthesis genes.
In addition to the early microscopy experiments, this model is also supported by the amounts of deamination seen in cpDNA. Deamination occurs when an amino group is lost and is a mutation that often results in base changes. When adenine is deaminated, it becomes hypoxanthine (H). Hypoxanthine can bind to cytosine, and when the HC base pair is replicated, it becomes a GC (thus, an A → G base change). In cpDNA, there are several A → G deamination gradients. DNA becomes susceptible to deamination events when it is single stranded. When replication forks form, the strand not being copied is single stranded, and thus at risk for A → G deamination. Therefore, gradients in deamination indicate that replication forks were most likely present and the direction that they initially opened (the highest gradient is most likely nearest the start site because it was single stranded for the longest amount of time). This mechanism is still the leading theory today; however, a second theory suggests that most cpDNA is actually linear and replicates through homologous recombination. It further contends that only a minority of the genetic material is kept in circular chromosomes while the rest is in branched, linear, or other complex structures.
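The A → G change described above can be traced step by step with a small sketch. It is a toy model under stated assumptions: every adenine in the exposed (single-stranded) region is deaminated, whereas real deamination is a rare chance event, and complementary strands are written position by position rather than in antiparallel orientation. The function names are illustrative.

```python
# Base pairing used during copying. Hypoxanthine (H) pairs with cytosine.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "H": "C"}


def deaminate(strand: str, exposed: range) -> str:
    """Turn adenine into hypoxanthine at every exposed position (assumption: 100% efficiency)."""
    return "".join(
        "H" if base == "A" and i in exposed else base
        for i, base in enumerate(strand)
    )


def copy_strand(template: str) -> str:
    """Return the complementary strand, aligned position by position with the template."""
    return "".join(PAIRS[base] for base in template)


def sequence_after_replication(strand: str, exposed: range) -> str:
    """Deaminate the exposed region, then copy twice to fix the change."""
    damaged = deaminate(strand, exposed)   # A -> H in the single-stranded region
    daughter = copy_strand(damaged)        # H templates a C
    return copy_strand(daughter)           # that C templates a G


if __name__ == "__main__":
    # Positions 0-2 are single-stranded; position 4 stays double-stranded.
    print(sequence_after_replication("ATAGA", range(0, 3)))  # GTGGA
```

In the example, the two exposed adenines become guanines while the protected adenine at the last position is unchanged. Regions that stay single-stranded longer would accumulate more such changes, which is the gradient the text refers to.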
Curiously, around half of the protein products of transferred genes are not targeted back to the chloroplast. Many became exaptations, taking on new functions such as participating in cell division, protein routing, and even disease resistance. A few chloroplast genes found new homes in the mitochondrial genome; most became nonfunctional pseudogenes, though a few tRNA genes still work in the mitochondrion. Some transferred chloroplast DNA protein products are directed to the secretory pathway. Many secondary plastids, however, are bounded by an outermost membrane derived from the host's cell membrane and are therefore topologically outside the cell: reaching the chloroplast from the cytosol requires crossing the cell membrane, just as it would for a protein headed for the extracellular space. In those cases, chloroplast-targeted proteins do initially travel along the secretory pathway.
Because the cell acquiring a chloroplast already had mitochondria (and peroxisomes, and a cell membrane for secretion), the new chloroplast host had to develop a unique protein targeting system to keep chloroplast proteins from being sent to the wrong organelle.
Chloroplast transit peptides exhibit huge variation in length and amino acid sequence. They can be from 20 to 150 amino acids long, an unusually long length that suggests transit peptides are actually collections of domains with different functions. Transit peptides tend to be positively charged, rich in the hydroxylated amino acids serine and threonine as well as in proline, and poor in acidic amino acids like aspartic acid and glutamic acid. In an aqueous solution, the transit sequence forms a random coil.
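The composition features described above can be tallied for any candidate sequence with a short sketch. This is not a transit peptide predictor. It only counts residues, the net charge is a crude estimate that assumes +1 for lysine and arginine and -1 for aspartate and glutamate, and the example sequence is invented for illustration.

```python
HYDROXYLATED = set("ST")  # serine, threonine
ACIDIC = set("DE")        # aspartic acid, glutamic acid
BASIC = set("KR")         # lysine, arginine (histidine ignored: assumption)


def transit_peptide_features(peptide: str) -> dict:
    """Summarize length and composition of a peptide given in one-letter code."""
    n = len(peptide)
    if n == 0:
        raise ValueError("peptide must not be empty")

    def count(group: set) -> int:
        return sum(1 for aa in peptide if aa in group)

    return {
        "length": n,
        "length_in_range": 20 <= n <= 150,            # range quoted in the text
        "net_charge": count(BASIC) - count(ACIDIC),   # crude estimate
        "hydroxylated_fraction": count(HYDROXYLATED) / n,
        "proline_fraction": peptide.count("P") / n,
        "acidic_fraction": count(ACIDIC) / n,
    }


if __name__ == "__main__":
    # Hypothetical sequence, not taken from any real protein.
    example = "MASSTLSSPTRKSALRSTPSLKR"
    for name, value in transit_peptide_features(example).items():
        print(f"{name}: {value}")
```

A sequence matching the description would show a length within range, a positive net charge, high serine and threonine content, and few or no acidic residues.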
Not all chloroplast proteins include an N-terminal cleavable transit peptide, though. Some include the transit sequence within the mature part of the protein itself. A few have their transit sequence appended to their C-terminus instead. Most of the polypeptides that lack N-terminal targeting sequences are the ones that are sent to the outer chloroplast membrane, plus at least one sent to the inner chloroplast membrane.
Phosphorylation changes the polypeptide's shape, making it easier for 14-3-3 proteins to attach to the polypeptide. In plants, 14-3-3 proteins only bind to chloroplast preproteins. The polypeptide is also bound by the heat shock protein Hsp70, which keeps it from folding prematurely. This is important because it prevents chloroplast proteins from assuming their active form and carrying out their chloroplast functions in the wrong place, the cytosol. At the same time, the preproteins have to keep just enough shape that they can be recognized and imported into the chloroplast.
The heat shock protein and the 14-3-3 proteins together form a cytosolic guidance complex that makes it easier for the chloroplast polypeptide to get imported into the chloroplast.
Alternatively, if its transit peptide is not phosphorylated, a chloroplast preprotein can still attach to a heat shock protein or Toc159. These complexes can bind to the TOC complex on the outer chloroplast membrane using energy from GTP.
The first three proteins form a core complex that consists of one Toc159, four to five Toc34s, and four Toc75s that form four holes in a disk 13 nanometers across. The whole core complex weighs about 500 kilodaltons. The other two proteins, Toc64 and Toc12, are associated with the core complex but are not part of it.
Toc34's job is to catch some chloroplast preproteins in the cytosol and hand them off to the rest of the TOC complex. When GTP, an energy molecule similar to ATP, attaches to Toc34, the protein becomes much more able to bind to many chloroplast preproteins in the cytosol. The chloroplast preprotein's presence causes Toc34 to break GTP into guanosine diphosphate (GDP) and inorganic phosphate. This loss of GTP makes the Toc34 protein release the chloroplast preprotein, handing it off to the next TOC protein. Toc34 then releases the depleted GDP molecule, probably with the help of an unknown GDP exchange factor. A domain of Toc159 might be the exchange factor that carries out the GDP removal. The Toc34 protein can then take up another molecule of GTP and begin the cycle again.
Toc34 can be turned off through phosphorylation. A protein kinase drifting around on the outer chloroplast membrane can use ATP to add a phosphate group to the Toc34 protein, preventing it from being able to receive another GTP molecule, inhibiting the protein's activity. This might provide a way to regulate protein import into chloroplasts.
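The Toc34 cycle and its phosphorylation switch can be summarized as a small state machine. The sketch below is a toy model, not a kinetic simulation. The state names, class name, and method names are invented for illustration, and each step is treated as all-or-nothing.

```python
from enum import Enum, auto


class State(Enum):
    EMPTY = auto()             # no nucleotide bound
    GTP_BOUND = auto()         # ready to bind a preprotein
    PREPROTEIN_BOUND = auto()  # GTP and preprotein both bound
    GDP_BOUND = auto()         # GTP hydrolyzed, preprotein handed off


class Toc34Model:
    """Toy model of the Toc34 GTPase cycle described in the text."""

    def __init__(self) -> None:
        self.state = State.EMPTY
        self.phosphorylated = False  # set True to model the kinase switching Toc34 off

    def bind_gtp(self) -> bool:
        # A phosphorylated receptor cannot take up GTP.
        if self.phosphorylated or self.state is not State.EMPTY:
            return False
        self.state = State.GTP_BOUND
        return True

    def bind_preprotein(self) -> bool:
        if self.state is not State.GTP_BOUND:
            return False
        self.state = State.PREPROTEIN_BOUND
        return True

    def hydrolyze_and_hand_off(self) -> bool:
        # The preprotein triggers GTP -> GDP + phosphate, and is released.
        if self.state is not State.PREPROTEIN_BOUND:
            return False
        self.state = State.GDP_BOUND
        return True

    def release_gdp(self) -> bool:
        if self.state is not State.GDP_BOUND:
            return False
        self.state = State.EMPTY
        return True

    def run_cycle(self) -> bool:
        """Attempt one full cycle; return True only if every step succeeded."""
        return (
            self.bind_gtp()
            and self.bind_preprotein()
            and self.hydrolyze_and_hand_off()
            and self.release_gdp()
        )


if __name__ == "__main__":
    receptor = Toc34Model()
    print(receptor.run_cycle())   # True: one preprotein handed off
    receptor.phosphorylated = True
    print(receptor.run_cycle())   # False: cycle blocked at GTP binding
```

Modeling phosphorylation as a block on the GTP-binding step mirrors the description above, where the added phosphate group prevents Toc34 from receiving another GTP molecule.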
Arabidopsis thaliana has two homologous proteins, AtToc33 and AtToc34 (the At stands for Arabidopsis thaliana), which are each about 60% identical in amino acid sequence to Toc34 in peas (called psToc34). AtToc33 is the more common of the two in Arabidopsis, and it is the functional analogue of Toc34 because it can be turned off by phosphorylation. AtToc34, on the other hand, cannot be phosphorylated.
Toc159 probably works a lot like Toc34, recognizing proteins in the cytosol using GTP. It can be regulated through phosphorylation, but by a different protein kinase than the one that phosphorylates Toc34. Its M-domain forms part of the tunnel that chloroplast preproteins travel through, and seems to provide the force that pushes preproteins through, using the energy from GTP.
Toc159 is not always found as part of the TOC complex; it has also been found dissolved in the cytosol. This suggests that it might act as a shuttle that finds chloroplast preproteins in the cytosol and carries them back to the TOC complex. However, there is little direct evidence for this behavior.
A family of Toc159 proteins (Toc159, Toc132, Toc120, and Toc90) has been found in Arabidopsis thaliana. They vary in the length of their A-domains, which is completely absent in Toc90. Toc132, Toc120, and Toc90 seem to have specialized functions in importing proteins such as nonphotosynthetic preproteins, and cannot replace Toc159.
Toc75 can also bind to chloroplast preproteins, but does so far less effectively than Toc34 or Toc159.
Arabidopsis thaliana has multiple isoforms of Toc75 that are named by the chromosomal positions of the genes that code for them. AtToc75 III is the most abundant of these.
Like the TOC translocon, the TIC translocon has a large core protein complex surrounded by some loosely associated peripheral proteins like Tic110, Tic40, and Tic21. The core complex weighs about one million daltons and contains Tic214, Tic100, Tic56, and Tic20 I, possibly three of each.
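As a rough consistency check on the stated stoichiometry, the core complex mass can be estimated from its subunits. The sketch below assumes that the number in each Tic name approximates that subunit's mass in kilodaltons and that exactly three copies of each subunit are present. Both are simplifications, and the text itself only says "possibly three of each".

```python
# Assumption: the number in each protein name approximates its mass in kDa.
CORE_SUBUNIT_KDA = {
    "Tic214": 214,
    "Tic100": 100,
    "Tic56": 56,
    "Tic20": 20,
}
COPIES_EACH = 3  # "possibly three of each" in the text; treated here as exact


def core_mass_kda(subunits: dict = CORE_SUBUNIT_KDA, copies: int = COPIES_EACH) -> int:
    """Estimate the core complex mass as copies x (sum of subunit masses)."""
    return copies * sum(subunits.values())


if __name__ == "__main__":
    mass = core_mass_kda()
    print(f"Estimated core mass: {mass} kDa")  # 3 * (214 + 100 + 56 + 20) = 1170 kDa
```

Under these assumptions the estimate comes to 1,170 kDa, the same order as the roughly one million daltons quoted for the core complex.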
Unlike Tic214, Tic100, or Tic56, Tic20 has homologous relatives in cyanobacteria and nearly all chloroplast lineages, suggesting it evolved before the first chloroplast endosymbiosis. Tic214, Tic100, and Tic56 are unique to chloroplastidan chloroplasts, suggesting that they evolved later.
Tic56 and Tic100 are highly conserved among land plants, but they don't resemble any protein whose function is known. Neither has any transmembrane domains.